As of 06-04-2020, more than 1,133,000 cases of COVID-19 have been reported across the world, with more than 62,000 deaths (1). This weekly report presents forecasts of the reported number of deaths in the week ahead and analysis of case reporting trends (case ascertainment) for 42 countries with active transmission.
The accuracy of these forecasts varies with the quality of surveillance and reporting in each country. We use the reported number of deaths due to COVID-19 to make these short-term forecasts as these are likely more reliable and stable over time than reported cases. In countries with poor reporting of deaths, these forecasts will likely represent an under-estimate, while the forecasts for countries with few deaths might be unreliable. Our estimates of transmissibility reflect the epidemiological situation at the time of the infection of COVID-19 fatalities. Therefore, the impact of controls on estimated transmissibility will be quantifiable with a delay between transmission and death.
Forecasts and Transmissibility Estimates
Based on our best estimates of transmissibility, the COVID-19 epidemic is:
Based on the central trends in the forecasts, the total number of reported deaths in the coming week is expected to be:
The main objective in this report is to produce forecasts of the number of deaths in the week ahead for each country with active transmission.
We define a country as having active transmission if at least ten deaths were observed in the country in each of the past two weeks. In the analysis for week beginning 29-03-2020, 22 countries/regions were included in the analysis. For the week beginning 05-04-2020, the number of countries/regions included based on these thresholds is 42.
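As a minimal sketch of this inclusion rule, the Python function below checks the threshold on a series of daily death counts. The function name `has_active_transmission`, the input layout (a list ordered from oldest to newest) and the reading of "week" as a 7-day block ending on the latest date are assumptions made for illustration only.

```python
def has_active_transmission(daily_deaths, threshold=10):
    """Return True if each of the last two 7-day blocks has >= threshold deaths.

    daily_deaths: daily reported deaths, oldest first (hypothetical layout).
    """
    if len(daily_deaths) < 14:
        # Not enough data to cover two full weeks.
        return False
    last_week = sum(daily_deaths[-7:])
    previous_week = sum(daily_deaths[-14:-7])
    return last_week >= threshold and previous_week >= threshold
```

A country with two deaths per day for 14 days would be included, while a country with no deaths in the first of the two weeks would not.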
We forecast the number of potential deaths as the reporting of deaths is likely to be more reliable and stable over time than the reporting of cases.
As we are forecasting deaths, the latest estimates of transmissibility reflect the epidemiological situation at the time of the infection of COVID-19 fatalities. Therefore, the impact of controls on estimated transmissibility will be quantifiable with a delay between transmission and death.
A secondary objective of this report is to analyse case ascertainment per country. As well as forecasting ahead, we use the number of reported deaths and of cases reported with a delay (delay from reporting to death, see Case Ascertainment method) to analyse the reporting trends per country. If the reporting of cases and deaths were perfect, and the delay between reporting and death were known, the ratio of deaths to delayed cases would equal the Case Fatality Ratio (CFR).
In this analysis, key assumptions are:
Key results below are based on an ensemble forecast of two models.
Transmissibility is characterised by the reproduction number \(R_t\), i.e. the average number of cases that one infected individual is likely to infect. Analysis of transmissibility indicates that the reproduction numbers last week (week starting 05-04-2020) were highest in:
and were lowest in:
Forecasts of predicted deaths in the coming week (week starting 05-04-2020) are highest in:
and are lowest in:
Forecasts in previous weeks performed well, with 67.7% of the observed daily number of deaths across all countries included in the 95% CrI of the forecast intervals.
Case ascertainment was estimated based on the deaths in the previous 2 weeks and reported cases in the 10 days prior to that period. Estimates of case ascertainment were highly variable and, due to the underlying assumption of perfect reporting, are likely to be an underestimate. In particular, community deaths due to COVID-19 are likely under-reported (3). Results indicate that, assuming perfect reporting of deaths, the countries with the highest case ascertainment were:
and again assuming perfect reporting of deaths, the countries with the lowest case ascertainment were:
Based on the estimated ascertainment, we estimated the true size of the epidemic in each country in the previous 7 days (week starting 22-03-2020). Countries with the largest true epidemic size in this period were:
and countries with the lowest true epidemic size in this period were:
We define a country to have active transmission if at least ten deaths were observed in the country in the last two consecutive weeks. We intend to produce forecasts every week, for the week ahead. Ensemble forecasts are produced from the outputs of three different models.
Our main analysis assumes a gamma distributed serial interval with mean 6.48 days and standard deviation of 3.83 days following (4). The serial interval estimates observed from various studies thus far may be biased toward lower values due to observation bias whereby, in contact tracing studies, long serial intervals tend to be under-represented. To account for this, as a sensitivity analysis, we also use a shorter serial interval of mean 4.80 days and standard deviation of 2.70 days (5). Results using this shorter interval are presented in the section Sensitivity Analyses. While using a longer serial interval has very little impact on the weekly forecasts produced, it results in much higher estimates of transmissibility.
Figure 1: Serial interval distributions used in the analysis. Here the serial interval relates to death and characterises the time between the deaths of an infector and their infectee. Our main analysis assumes a gamma distribution with a mean of 6.48 days and a standard deviation of 3.83 days (shown in green). The shorter serial interval, used for sensitivity analysis, with a mean of 4.80 days and a standard deviation of 2.70 days is shown in purple.
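The two serial interval distributions can be reproduced approximately from their means and standard deviations. The sketch below converts a mean and standard deviation into gamma shape and scale parameters (shape = mean²/sd², scale = sd²/mean) and builds daily weights \(w_s\). The discretisation used here (evaluating the density at whole days, setting \(w_0 = 0\), truncating at 40 days and renormalising) is a simplifying assumption for illustration and is not necessarily the scheme used in the report.

```python
import math

def gamma_shape_scale(mean, sd):
    """Convert mean and standard deviation to gamma shape and scale."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def gamma_pdf(x, shape, scale):
    """Gamma density at x > 0, computed on the log scale for stability."""
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def discretised_serial_interval(mean, sd, max_days=40):
    """Daily weights w[0..max_days] with w[0] = 0, normalised to sum to 1."""
    shape, scale = gamma_shape_scale(mean, sd)
    density = [0.0] + [gamma_pdf(s, shape, scale) for s in range(1, max_days + 1)]
    total = sum(density)
    return [d / total for d in density]

w_main = discretised_serial_interval(6.48, 3.83)   # main analysis
w_short = discretised_serial_interval(4.80, 2.70)  # sensitivity analysis
```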
This is an unweighted ensemble of Models 1, 2 and 3. We obtained the posterior distributions for all estimated reproduction numbers and forecasted deaths by simply combining the posterior distributions of each model.
Ensemble models, even if built with a relatively simple approach such as adopted here, have been shown to typically perform better than individual models in the context of epidemiology of infectious diseases (6).
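One simple way to combine posterior distributions without weights is to pool an equal number of posterior samples from each model and summarise the pooled sample. The sketch below illustrates this; the function names, the dictionary layout and the choice to truncate every model to the same number of samples are assumptions for illustration, not a description of the report's implementation.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction

def unweighted_ensemble(samples_by_model):
    """Pool equal numbers of posterior samples per model; return median and 95% CrI."""
    n = min(len(samples) for samples in samples_by_model.values())
    pooled = []
    for samples in samples_by_model.values():
        pooled.extend(samples[:n])  # same contribution from every model
    pooled.sort()
    return {
        "median": quantile(pooled, 0.5),
        "lower": quantile(pooled, 0.025),
        "upper": quantile(pooled, 0.975),
    }
```

The same function applies to samples of \(R_t\) or to samples of forecasted weekly deaths.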
Current and past forecasts
Caution note: We note that in France, a large increase in deaths was reported towards the end of the week starting 30-03-2020. This is largely due to back-reporting of deaths outside hospital settings, and therefore, this is likely to have inflated the estimates of \(R_t\). The forecasts of deaths for the coming week are therefore likely to be over-estimated.
Figure 2: Reported daily deaths, current and past forecasts based on the ensemble model. For each country with active transmission (see methods), we plot the observed incidence of deaths (black dots). Past forecasts, where available, are shown in green (median and 95% CrI), while latest forecasts are shown in red (median and 95% CrI). Vertical dashed lines show the start and end of each week (Monday to Sunday).
Figure 3: Latest estimates of effective reproduction numbers by country (median and 95% CrI). We present the estimates of current transmissibility estimated from each method as well as the ensemble estimates.
Table 1: Observed (where available) and forecasted weekly death counts and the estimated levels of transmissibility from the ensemble model for each country with active transmission (see Methods) and for each period for which forecasts were produced. The number of deaths has been rounded to 3 significant figures.
The approach, similar to model 2, was to estimate the current reproduction number (the average number of secondary cases generated by a typical infected individual, \(R_t\)) and to use that to forecast future incidence of death. The current reproduction number was estimated assuming constant transmissibility during a chosen time-window (i.e. one week).
Estimating current transmissibility
Here we relied on a well-established and simple method (7) that assumed the daily incidence, \(I_t\) (here representing deaths), could be approximated with a Poisson process following the renewal equation (8):
\[I_t \sim Pois\left( R_t \sum_{s=0}^tI_{t-s}w_s\right)\]
where \(R_t\) is the instantaneous reproduction number and \(w\) is the serial interval distribution. From this a likelihood of the data given a set of model parameters can be calculated, as well as the posterior distribution of \(R_t\) given previous observations of incidence and knowledge of the serial interval (9).
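To illustrate how a posterior for \(R_t\) follows from the renewal equation, the sketch below uses the conjugate gamma-Poisson update: with a gamma prior on a reproduction number held constant over a window, the posterior is again gamma, with shape increased by the deaths observed in the window and rate increased by the summed infectiousness \(\Lambda_t = \sum_{s \ge 1} I_{t-s} w_s\). This closed-form shortcut is shown for illustration only and is not the MCMC procedure used in this model. The prior values (shape 1, scale 5), the assumption \(w_0 = 0\) and all function names are hypothetical choices.

```python
def total_infectiousness(incidence, w, t):
    """Lambda_t = sum over s >= 1 of I_{t-s} * w_s (w[0] is assumed to be 0)."""
    return sum(incidence[t - s] * w[s] for s in range(1, min(t, len(w) - 1) + 1))

def rt_posterior(incidence, w, window, prior_shape=1.0, prior_scale=5.0):
    """Gamma posterior (shape, scale) for R held constant over the last `window` days."""
    last = len(incidence) - 1
    days = range(max(1, last - window + 1), last + 1)
    shape = prior_shape + sum(incidence[t] for t in days)
    rate = 1.0 / prior_scale + sum(total_infectiousness(incidence, w, t) for t in days)
    return shape, 1.0 / rate

# Toy example: flat incidence should give a posterior mean close to 1.
shape, scale = rt_posterior([10] * 20, [0.0, 0.5, 0.5], window=7)
posterior_mean = shape * scale
```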
We used this approach to estimate \(R_t\) over three alternative time-windows defined by assuming a constant \(R_t\) for either the 2, 3 or 4 weeks prior to the most recent data-point. We made no assumptions regarding the epidemiological situation and transmissibility prior to each time-window. Therefore, no data prior to the time-window were used to estimate \(R_t\), and instead we jointly estimated \(R_t\) as well as back-calculated the incidence before the time-window. Specifically, we jointly estimated the \(R_t\) and the incidence level 100 days before the time-window. Past incidence was then calculated using the known relationship between the serial interval, growth rate and reproduction number. The joint posterior distribution of \(R_t\) and the early epidemic curve (from which forecasts will be generated) were inferred using Markov Chain Monte Carlo (MCMC) sampling.
The model has the advantage of being robust to changes in reporting before the time-window used for inference.
Forward projections
We used the renewal equation (8) to project the incidence forward, given a back-calculated early incidence curve, an estimated reproduction number, and the observed incidence over the calibration period. We sampled sets of back-calculated early incidence curves and reproduction numbers from the posterior distribution obtained in the estimation process. For each of these sets, we simulated stochastic realisations of the renewal equation from the end of the calibration period leading to projected incidence trajectories.
Projections were made on a 7-day horizon. The transmissibility is assumed to remain constant over this time period. If transmissibility were to decrease as a result of control interventions and/or changes in behaviour over this time period, we would predict fewer deaths; similarly, if transmissibility were to increase over this time period, we would predict more deaths. We limited our projection to 7 days only, as assuming constant transmissibility over longer time horizons seemed unrealistic in light of the different interventions implemented by different countries and potential voluntary behaviour changes.
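The forward projection step can be sketched as a stochastic simulation of the renewal equation with a fixed reproduction number over 7 days. In the sketch below, each simulated day is a Poisson draw with mean \(R \sum_{s \ge 1} I_{t-s} w_s\), and the draw is appended to the history before the next day is simulated. The function names, the assumption \(w_0 = 0\) and the simple Poisson sampler (adequate only for modest daily means) are illustrative assumptions; in practice one trajectory would be simulated per posterior sample of the reproduction number and early incidence curve.

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's method; suitable for modest lam (exp(-lam) underflows for lam > ~700)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def project_deaths(incidence, w, r, horizon=7, rng=None):
    """Simulate `horizon` days of deaths with constant reproduction number r."""
    if rng is None:
        rng = random.Random()
    history = list(incidence)
    for _ in range(horizon):
        t = len(history)
        lam = sum(history[t - s] * w[s] for s in range(1, min(t, len(w) - 1) + 1))
        history.append(poisson_draw(rng, r * lam))
    return history[len(incidence):]

# Toy example: one trajectory per (hypothetical) posterior sample of R.
rng = random.Random(1)
observed = [3, 4, 6, 5, 8, 9, 12]
w = [0.0, 0.2, 0.5, 0.3]
trajectories = [project_deaths(observed, w, r, rng=rng) for r in (0.9, 1.2, 1.5)]
weekly_totals = [sum(trajectory) for trajectory in trajectories]
```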
Current and past forecasts
Caution note: We note that in France, a large increase in deaths was reported towards the end of the week starting 30-03-2020. This is largely due to back-reporting of deaths outside hospital settings, and therefore, this is likely to have inflated the estimates of \(R_t\). The forecasts of deaths for the coming week are therefore likely to be over-estimated.
Figure 4: Reported daily deaths, current and past forecasts based on model 1. For each country with active transmission (see Methods), we plot the observed incidence of deaths (black dots). Past forecasts, where available, are shown in green (median and 95% CrI), while latest forecasts are shown in red (median and 95% CrI). Vertical dashed lines show the start and end of each week (Monday to Sunday).
Figure 5: Latest estimates of effective reproduction numbers by country (median and 95% CrI). We present the estimates of current transmissibility estimated from model 1.
Table 2: Observed (where available) and forecasted weekly death counts, and estimated levels of transmissibility from Model 1 for each country with active transmission (see methods) and for each period for which forecasts were produced. The number of deaths has been rounded to 3 significant figures.
Estimating current transmissibility
The standard approach to inferring the effective reproduction number at \(t\), \(R_t\), from an incidence curve (with cases at \(t\) denoted \(I_t\)) is provided by (9). This method assumes that \(R_t\) is constant over a window back in time of size \(k\) units (e.g. days or weeks) and uses the part of the incidence curve contained in this window to estimate \(R_t\). However, estimates of \(R_t\) can depend strongly on the width of the time-window used for estimation. Thus mis-specified time-windows can bias our inference. In (10) we use information theory to extend the approach of Cori et al. to optimise the choice of the time-window and refine estimates of \(R_t\). Specifically:
We integrate over the entire posterior distribution of \(R_t\) to obtain the posterior predictive distribution of incidence at time \(t+1\) as \(P(I_{t+1} \mid I_1^t)\), with \(I_1^t\) as the incidence curve up to \(t\). For a gamma posterior distribution over \(R_t\) this is analytic and negative binomial (see (10) for exact formulae).
We compute this distribution sequentially and causally across the existing incidence curve and then evaluate every observed case-count according to this posterior predictive distribution. For example, at \(t = 5\), we pick the true incidence value \(I_5^*\) and evaluate the probability of seeing this value under the predictive distribution, i.e. \(P(I_5 = I_5^* \mid I_1^4)\).
This allows us to construct the accumulated predictive error (APE) under some window length k and under a given generation time distribution as:
\[\mathrm{APE}_{k} = \sum_{t = 0}^{T - 1} - \log P\left( I_{t + 1} = I_{t + 1}^{*} \mid I_{t - k + 1}^{t} \right)\]
The optimal window length \(k^{*}\) is then \(k^{*} = \arg\min_{k} \mathrm{APE}_{k}\). Here \(T\) is the last time point in the existing incidence curve.
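The window selection can be sketched as follows. For each candidate window length, the code accumulates the negative log predictive probability of each observed count, using the standard gamma-Poisson mixture: if \(R\) has a gamma posterior with shape \(\alpha\) and rate \(\beta\), then a Poisson count with mean \(R \Lambda_{t+1}\) is negative binomial with size \(\alpha\) and success probability \(\beta / (\beta + \Lambda_{t+1})\). The prior values, the assumption \(w_0 = 0\), the skipping of days with zero infectiousness and the function names are illustrative assumptions; the exact formulae are those given in (10).

```python
import math

def total_infectiousness(incidence, w, t):
    """Lambda_t = sum over s >= 1 of I_{t-s} * w_s (w[0] is assumed to be 0)."""
    return sum(incidence[t - s] * w[s] for s in range(1, min(t, len(w) - 1) + 1))

def log_neg_binomial(x, size, p):
    """Log pmf of a negative binomial with real-valued size and success probability p."""
    return (math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
            + size * math.log(p) + x * math.log1p(-p))

def ape(incidence, w, k, prior_shape=1.0, prior_scale=5.0):
    """Accumulated predictive error for window length k (smaller is better)."""
    total = 0.0
    for t in range(1, len(incidence) - 1):
        window = range(max(1, t - k + 1), t + 1)
        shape = prior_shape + sum(incidence[u] for u in window)
        rate = 1.0 / prior_scale + sum(total_infectiousness(incidence, w, u) for u in window)
        lam_next = total_infectiousness(incidence, w, t + 1)
        if lam_next <= 0:
            continue  # no infectiousness: predictive distribution is degenerate
        p = rate / (rate + lam_next)
        total -= log_neg_binomial(incidence[t + 1], shape, p)
    return total

def best_window(incidence, w, candidates):
    """Return the candidate window length with the smallest APE."""
    return min(candidates, key=lambda k: ape(incidence, w, k))
```

For example, `best_window(deaths, w, range(2, 15))` would compare windows of 2 to 14 days on a daily series `deaths`.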
Forward Projections
Forward projections are made assuming that the transmissibility remains unchanged over the projection horizon and equal to the transmissibility in the last time-window. The projections are made using the standard branching process model with a Poisson offspring distribution.
Current and past forecasts
Caution note: We note that in France, a large increase in deaths was reported towards the end of the week starting 30-03-2020. This is largely due to back-reporting of deaths outside hospital settings, and therefore, this is likely to have inflated the estimates of \(R_t\). The forecasts of deaths for the coming week are therefore likely to be over-estimated.
Figure 6: Reported daily deaths, current and past forecasts based on model 2. For each country with active transmission (see Methods), we plot the observed incidence of deaths (black dots). Past forecasts, where available, are shown in green (median and 95% CrI), while latest forecasts are shown in red (median and 95% CrI). Vertical dashed lines show the start and end of each week (Monday to Sunday).
Figure 7: Latest estimates of effective reproduction numbers by country (median and 95% CrI). We present the estimates of current transmissibility from model 2.
Table 3: Observed (where available) and forecasted weekly death counts and the estimated levels of transmissibility from Model 2 for each country with active transmission (see methods) and for each period for which forecasts were produced. The number of deaths has been rounded to 3 significant figures.
The methods for this model are presented in detail in the section “Case Ascertainment”. Please note that for this model, we do not estimate the effective reproduction number to forecast ahead.
Current and Past Forecasts